1 Introduction.

The notion of a Markov chain is ubiquitous in linear algebra and probability
books. For example, see [3, Theorem 5.25] and [p. 173]. Also, see [5, p. 131]
for the history of the subject. A Markov (or stochastic) matrix is a square
matrix whose entries are non-negative and whose column sums are equal to 1.
The term stochastic matrix seems to prevail in the current literature, and therefore
we use it in the title. But, since a Markov matrix is the transition matrix of a
Markov chain, we prefer the term Markov matrix and use it from now on.
The theorem below gives some of the standard results for such matrices.

Theorem 1.1. Let $M$ be an $n \times n$ Markov matrix. Suppose that there exists
$p \in \mathbb{N}$ such that all the entries of $M^p$ are positive. Then the following statements
are true.

(a) There exists a unique $E \in \mathbb{R}^n$ such that

\[ ME = E \quad\text{and}\quad \operatorname{sum} E = 1. \]

(b) Let $P$ be the square matrix each of whose columns is equal to $E$. Then $P$ is
a projection and $PX = (\operatorname{sum} X)E$ for each $X \in \mathbb{R}^n$.

(c) The powers $M^k$ tend to $P$ as $k$ tends to $+\infty$.

The statement that all the entries of some power of $M$ are positive is usually
abbreviated by saying that $M$ is regular. The fact that all the entries of $E$ are
positive is easily shown, since $M^p E = E$, $\operatorname{sum} E = 1$, and $M^p$ is positive.

Theorem 1.1 follows readily from Theorem 4.2, the main result in this article.
In Theorem 4.2 the requirement that the entries of $M$ be non-negative is
dropped, the requirement that the column sums be equal to 1 is retained, and
the condition on $M^p$ is replaced by something completely different. However,
the conclusions (a), (b), (c) hold true. Our proof is significantly different from
all proofs of Theorem 1.1 that we are aware of.

Here is an example:

\[ M = \frac{1}{5}\begin{bmatrix} 0 & 2 & -4 \\ -1 & -1 & 0 \\ 6 & 4 & 9 \end{bmatrix}. \]

Examining the first ten powers of $M$ with a computer strongly suggests that
the powers of $M$ converge. Indeed, Theorem 4.2 applies here. For this, one
must examine $M^2$; see Example 6.1. The limit is found, as in the case of a
Markov matrix, by determining an eigenvector of $M$. It turns out that
$E = \frac{1}{3}\begin{bmatrix} -6 & 1 & 8 \end{bmatrix}^T$, and

\[ \lim_{k \to +\infty} M^k = \frac{1}{3}\begin{bmatrix} -6 & -6 & -6 \\ 1 & 1 & 1 \\ 8 & 8 & 8 \end{bmatrix}. \]

Since $M$ is only a $3 \times 3$ matrix, we could show convergence by looking at the
eigenvalues, which are $1$, $2/5$, $1/5$.
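The computer experiment mentioned above can be sketched as follows. This is only an illustration: it assumes that the example matrix reads $M = \frac{1}{5}[[0, 2, -4], [-1, -1, 0], [6, 4, 9]]$ (a reading consistent with the value of $M^2$ given in Example 6.1), and the helper names `mat_mul` and `max_abs_diff` are ours. Exact rational arithmetic is used so that rounding plays no role.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def max_abs_diff(A, B):
    """Largest entrywise distance between two matrices of the same shape."""
    return max(abs(a - b) for row_a, row_b in zip(A, B)
               for a, b in zip(row_a, row_b))

# Assumed reading of the example matrix, with exact rational entries.
M = [[Fraction(x, 5) for x in row]
     for row in ([0, 2, -4], [-1, -1, 0], [6, 4, 9])]

# Claimed limit: every column equals E = (1/3) [-6, 1, 8]^T.
P = [[Fraction(x, 3)] * 3 for x in (-6, 1, 8)]

power = M
errors = []          # errors[k-1] = largest entry of |M^k - P|
for k in range(1, 11):
    errors.append(max_abs_diff(power, P))
    power = mat_mul(power, M)

for k, err in enumerate(errors, start=1):
    print(f"k = {k:2d}   max|M^k - P| = {float(err):.6f}")
```

The printed distances shrink roughly by the factor $2/5$ per step, in line with the eigenvalues listed above.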

Theorem 1.1 is often presented as an application of the Perron-Frobenius
Theorem. In [6], the authors give a version of the Perron-Frobenius Theorem
for matrices with some negative entries, but their results do not seem to be
related to ours.



2 Definitions. 

All numbers in this article are real, except in Example 6.4. We study $m \times n$
matrices with real entries. The elements of $\mathbb{R}^n$ will be identified with column
matrices, that is, with $n \times 1$ matrices. By $J$ we denote any row matrix with all
entries equal to 1.

Let $X \in \mathbb{R}^n$ with entries $x_1, \ldots, x_n$. Set

\[ \operatorname{sum} X := \sum_{j=1}^{n} x_j, \qquad \|X\| := \sum_{j=1}^{n} |x_j|. \]

Notice that $\operatorname{sum} X = JX$ and that $\|\cdot\|$ is the $\ell_1$-norm.

For an $m \times n$ matrix $A$ with columns $A_1, \ldots, A_n$ the variation (or column
variation) of $A$ is defined by:

\[ \operatorname{var} A := \frac{1}{2} \max_{1 \le j, k \le n} \|A_j - A_k\|. \]

If the column sums of a matrix $A$ are all equal to $a$, that is, if $JA = aJ$, we
say that $A$ is of type $a$.
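The two definitions translate directly into code. The following is a minimal sketch, not part of the original text; the function names `variation` and `matrix_type` and the sample matrix `A` are our own choices, and matrices are stored as lists of rows.

```python
from fractions import Fraction

def columns(A):
    """Columns of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def l1_norm(X):
    """The l1-norm ||X||: sum of absolute values of the entries."""
    return sum(abs(x) for x in X)

def variation(A):
    """var A: half the largest l1-distance between two columns of A."""
    cols = columns(A)
    return Fraction(1, 2) * max(
        l1_norm([u - v for u, v in zip(cj, ck)]) for cj in cols for ck in cols)

def matrix_type(A):
    """Return a if every column of A sums to a (that is, JA = aJ), else None."""
    sums = {sum(col) for col in columns(A)}
    return sums.pop() if len(sums) == 1 else None

# Hypothetical 2 x 3 sample matrix with column sums all equal to 1.
A = [[Fraction(1, 2), Fraction(0), Fraction(1, 4)],
     [Fraction(1, 2), Fraction(1), Fraction(3, 4)]]
print(variation(A), matrix_type(A))
```

For this sample the largest column distance is between the first two columns, $\|A_1 - A_2\| = 1$, so the variation is $1/2$ and the type is $1$.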



3 Column variation and matrix type. 

In this section we establish the properties of the column variation and the matrix
type that are needed for the proof of our main result. We point out that the
restriction to real numbers in the theorem below is essential, as Example 6.4
shows.

Theorem 3.1. Let $A$ be an $m \times n$ matrix and $X \in \mathbb{R}^n$. If $\operatorname{sum} X = 0$, then

\[ \|AX\| \le (\operatorname{var} A)\,\|X\|. \]

Proof. In this proof, for any real number $t$ we put $t^+ := \max\{t, 0\}$ and
$t^- := \max\{-t, 0\}$. Clearly $t^+, t^- \ge 0$ and $t = t^+ - t^-$.

Let $A_1, \ldots, A_n$ be the columns of $A$. Assume $\operatorname{sum} X = 0$. The conclusion is
obvious if $X = 0$.

Assume that $X \ne 0$. Then, by scaling, we can also assume that $\|X\| = 2$.
Let $x_1, \ldots, x_n$ be the entries of $X$. Then

\[ \sum_{k=1}^{n} x_k^+ = \sum_{k=1}^{n} x_k^- = 1. \]

Consequently

\[ AX = \sum_{k=1}^{n} x_k A_k = \sum_{k=1}^{n} x_k^+ A_k - \sum_{j=1}^{n} x_j^- A_j. \]

Now we notice that $AX$ is the difference of two convex combinations of the
columns of $A$. From this, the inequality in the theorem seems geometrically
obvious. However, we continue with an algebraic argument:

\[ AX = \sum_{k=1}^{n} \sum_{j=1}^{n} x_j^- x_k^+ A_k - \sum_{j=1}^{n} \sum_{k=1}^{n} x_k^+ x_j^- A_j
      = \sum_{k=1}^{n} \sum_{j=1}^{n} x_k^+ x_j^- (A_k - A_j). \]

Consequently,

\[ \|AX\| \le \sum_{k=1}^{n} \sum_{j=1}^{n} x_k^+ x_j^- \|A_k - A_j\|
          \le 2 (\operatorname{var} A) \sum_{k=1}^{n} x_k^+ \sum_{j=1}^{n} x_j^-
          = (\operatorname{var} A)\,\|X\|. \qquad □ \]
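As a sanity check rather than a proof, the inequality of Theorem 3.1 can be tested on random integer data. The sketch below is ours: the sampling ranges, the seed, and the helper names are arbitrary assumptions. The last entry of each $X$ is chosen so that $\operatorname{sum} X = 0$.

```python
import random
from fractions import Fraction

def l1_norm(X):
    return sum(abs(x) for x in X)

def variation(A):
    """var A: half the largest l1-distance between two columns of A."""
    cols = list(zip(*A))
    return Fraction(1, 2) * max(
        l1_norm([u - v for u, v in zip(cj, ck)]) for cj in cols for ck in cols)

def mat_vec(A, X):
    """Product AX for a matrix stored as a list of rows."""
    return [sum(a * x for a, x in zip(row, X)) for row in A]

random.seed(0)                      # fixed seed, so the run is repeatable
worst_ratio = Fraction(0)           # largest observed ||AX|| / ((var A) ||X||)
for _ in range(1000):
    m, n = random.randint(1, 4), random.randint(2, 5)
    A = [[random.randint(-9, 9) for _ in range(n)] for _ in range(m)]
    X = [random.randint(-9, 9) for _ in range(n - 1)]
    X.append(-sum(X))               # last entry forces sum X = 0
    lhs = l1_norm(mat_vec(A, X))
    rhs = variation(A) * l1_norm(X)
    assert lhs <= rhs               # the inequality of Theorem 3.1
    if rhs > 0:
        worst_ratio = max(worst_ratio, lhs / rhs)

print("largest ratio observed:", worst_ratio)
```

Equality is attained whenever $X$ has only two nonzero entries placed on a pair of columns at maximal distance, so the observed ratio can reach, but never exceed, 1.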

Proposition 3.2. Let $A$ and $B$ be matrices such that $AB$ is defined. If $B$ is of
type $b$, then

\[ \operatorname{var}(AB) \le (\operatorname{var} A)(\operatorname{var} B). \]

Proof. Assume that $B$ is of type $b$ and let $B_1, \ldots, B_l$ be the columns of $B$. Then
$AB_1, \ldots, AB_l$ are the columns of $AB$. Since $B$ is of type $b$, for all $j, k \in \{1, \ldots, l\}$
we have $\operatorname{sum}(B_j - B_k) = 0$. Therefore, by Theorem 3.1,

\[ \|AB_j - AB_k\| \le (\operatorname{var} A)\,\|B_j - B_k\| \]

for all $j, k \in \{1, \ldots, l\}$. Hence,

\[ \operatorname{var}(AB) = \frac{1}{2} \max_{1 \le j, k \le l} \|AB_j - AB_k\|
   \le (\operatorname{var} A)\, \frac{1}{2} \max_{1 \le j, k \le l} \|B_j - B_k\|
   = (\operatorname{var} A)(\operatorname{var} B). \qquad □ \]

Proposition 3.3. Let $A$ and $B$ be matrices such that $AB$ is defined. If $A$ is of
type $a$ and $B$ is of type $b$, then $AB$ is of type $ab$.

Proof. If $JA = aJ$ and $JB = bJ$, then $J(AB) = (JA)B = aJB = abJ$. □
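Both propositions can be illustrated on a small integer example. The matrices `A` and `B` below are hypothetical and chosen only so that the arithmetic is easy to follow by hand; the helper names are ours.

```python
from fractions import Fraction

def l1_norm(X):
    return sum(abs(x) for x in X)

def variation(A):
    """var A: half the largest l1-distance between two columns of A."""
    cols = list(zip(*A))
    return Fraction(1, 2) * max(
        l1_norm([u - v for u, v in zip(cj, ck)]) for cj in cols for ck in cols)

def mat_mul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def column_sums(A):
    return [sum(col) for col in zip(*A)]

A = [[1, 0, 2],
     [3, 4, 2]]          # column sums 4, 4, 4, so A is of type a = 4
B = [[1, 2],
     [0, -2],
     [2, 3]]             # column sums 3, 3, so B is of type b = 3
AB = mat_mul(A, B)

print("AB =", AB)
print("column sums of AB:", column_sums(AB))          # Proposition 3.3: type ab
print("var(AB) =", variation(AB),
      " (var A)(var B) =", variation(A) * variation(B))  # Proposition 3.2
```

Here $AB$ has column sums $12 = 4 \cdot 3$, and $\operatorname{var}(AB) = 3 \le 4 = (\operatorname{var} A)(\operatorname{var} B)$.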

4 Square matrices. 

In the previous section we considered rectangular matrices. Next we study
square matrices. With one more property of matrix type, we shall be ready to
prove our main result, Theorem 4.2.

Proposition 4.1. If $M$ is a square matrix of type $c$, then $c$ is an eigenvalue of
$M$.

Proof. Assume that $JM = cJ$. Then $J(M - cI) = 0$. That is, the sum of the
rows of $M - cI$ is $0$, and so the rows of $M - cI$ are linearly dependent. Hence,
$M - cI$ is a singular matrix. □

Theorem 4.2. Let $M$ be an $n \times n$ matrix. Suppose that $M$ is of type 1 and that
there exists $p \in \mathbb{N}$ such that $\operatorname{var}(M^p) < 1$. Then the following statements are
true.

(a) There exists a unique $E \in \mathbb{R}^n$ such that

\[ ME = E \quad\text{and}\quad \operatorname{sum} E = 1. \]

(b) Let $P$ be the square matrix each of whose columns is equal to $E$. Then $P$ is
a projection and $PX = (\operatorname{sum} X)E$ for each $X \in \mathbb{R}^n$.

(c) The powers $M^k$ tend to $P$ as $k$ tends to $+\infty$.

Proof. Assume that $M$ is of type 1 and that there exists $p \in \mathbb{N}$ such that
$\operatorname{var}(M^p) < 1$. By Proposition 4.1 there exists a nonzero $Y \in \mathbb{R}^n$ such that
$MY = Y$.

Clearly $M^p Y = Y$. If $\operatorname{sum} Y = 0$, then, since $Y \ne 0$, Theorem 3.1 yields

\[ \|Y\| = \|M^p Y\| \le \operatorname{var}(M^p)\,\|Y\| < \|Y\|, \]

a contradiction. Hence $\operatorname{sum} Y \ne 0$, and setting $E = (1/\operatorname{sum} Y)\,Y$ provides a vector
whose existence is claimed in (a). To verify uniqueness, let $F$ be another such vector. Then
$\operatorname{sum}(E - F) = 0$, $M^p(E - F) = E - F$, and

\[ \|E - F\| = \|M^p(E - F)\| \le \operatorname{var}(M^p)\,\|E - F\|. \]

Consequently, $E - F = 0$, since $\operatorname{var}(M^p) < 1$.

By the definition of $P$ in (b), $P = EJ$. Therefore, $P^2 = (EJ)(EJ) =
E(JE)J = E[1]J = EJ = P$. To complete the proof of (b), we calculate:
$PX = E(JX) = (\operatorname{sum} X)E$.

Let $k \in \mathbb{N}$. Proposition 3.3 implies that $M^k$ is of type 1. By the division
algorithm there exist unique $q, r \in \mathbb{Z}$ such that $k = pq + r$ and $0 \le r < p$. Here
$q = \lfloor k/p \rfloor$ is the floor of $k/p$. By Proposition 3.2,

\[ \operatorname{var}(M^k) \le (\operatorname{var} M)^r \bigl(\operatorname{var}(M^p)\bigr)^q
   \le \Bigl(\max_{0 \le r < p} (\operatorname{var} M)^r\Bigr) \bigl(\operatorname{var}(M^p)\bigr)^{\lfloor k/p \rfloor}. \tag{4.1} \]

Let $X \in \mathbb{R}^n$ be such that $\operatorname{sum} X = 1$. Then $\operatorname{sum}(X - E) = 0$ and Theorem 3.1
implies that

\[ \|M^k X - E\| = \|M^k(X - E)\| \le \operatorname{var}(M^k)\,\|X - E\|. \tag{4.2} \]

Now, since $\operatorname{var}(M^p) < 1$, (4.1) implies that

\[ \lim_{k \to +\infty} \operatorname{var}(M^k) = 0, \]

and letting $X$ in (4.2) run through the vectors in the standard basis of $\mathbb{R}^n$ proves
(c). □

Remark 4.3. If $M$ is an $n \times n$ matrix of type 1, and the statements (a), (b),
and (c) are true, then there exists $p \in \mathbb{N}$ such that $\operatorname{var}(M^p) < 1$. This follows
from the fact that the variation function is continuous on the space of all $n \times n$
matrices.
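The hypothesis of Theorem 4.2 and the estimate (4.2) are easy to test numerically. The sketch below is an illustration under assumptions: it uses the matrix from the Introduction, read as $M = \frac{1}{5}[[0, 2, -4], [-1, -1, 0], [6, 4, 9]]$, with $E = \frac{1}{3}[-6, 1, 8]^T$, and the helper `smallest_contracting_power` with its cut-off `max_p` is a hypothetical name of ours.

```python
from fractions import Fraction

def l1_norm(X):
    return sum(abs(x) for x in X)

def variation(A):
    """var A: half the largest l1-distance between two columns of A."""
    cols = list(zip(*A))
    return Fraction(1, 2) * max(
        l1_norm([u - v for u, v in zip(cj, ck)]) for cj in cols for ck in cols)

def mat_mul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, X):
    return [sum(a * x for a, x in zip(row, X)) for row in A]

def smallest_contracting_power(M, max_p=50):
    """Return the least p <= max_p with var(M^p) < 1, or None if there is none."""
    power = M
    for p in range(1, max_p + 1):
        if variation(power) < 1:
            return p
        power = mat_mul(power, M)
    return None

# Assumed reading of the matrix from the Introduction and of its fixed vector E.
M = [[Fraction(x, 5) for x in row]
     for row in ([0, 2, -4], [-1, -1, 0], [6, 4, 9])]
E = [Fraction(-6, 3), Fraction(1, 3), Fraction(8, 3)]
n = len(M)
p = smallest_contracting_power(M)

# Check (4.2) for the standard basis vectors X and for k = 1, ..., 8.
bound_holds = True
power = M
for k in range(1, 9):
    v = variation(power)
    for i in range(n):
        X = [Fraction(int(i == j)) for j in range(n)]
        lhs = l1_norm([y - e for y, e in zip(mat_vec(power, X), E)])
        rhs = v * l1_norm([x - e for x, e in zip(X, E)])
        bound_holds = bound_holds and lhs <= rhs
    power = mat_mul(power, M)

print("p =", p, "  bound (4.2) holds:", bound_holds)
```

Since each standard basis vector has entry sum 1, the columns of $M^k$ are exactly the vectors $M^k X$ appearing in (4.2), which is why checking the basis vectors controls the whole matrix.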



5 Non-negative matrices.

The propositions in this section are useful for showing that the variation of
a Markov matrix is less than 1. They are used to deduce Theorem 1.1 from
Theorem 4.2 and also, repeatedly, in Example 6.3.

Proposition 5.1. Let $A$ be an $m \times n$ matrix of type $a$ with non-negative entries.
Then $\operatorname{var} A \le a$.

Proof. Let $A_1, \ldots, A_n$ be the columns of $A$. Since the entries of $A$ are
non-negative, $\|A_j\| = a$ for all $j \in \{1, \ldots, n\}$. Therefore,

\[ \|A_j - A_k\| \le \|A_j\| + \|A_k\| = 2a \]

for all $j, k \in \{1, \ldots, n\}$, and the proposition follows. □

Proposition 5.2. Let $a > 0$. Let $A$ be an $m \times n$ matrix of type $a$ with
non-negative entries. Then the following two statements are equivalent.

(i) The strict inequality $\operatorname{var} A < a$ holds.

(ii) For each $k, l \in \{1, \ldots, n\}$ there exists $j \in \{1, \ldots, m\}$ such that the $j$-th
entries of the $k$-th and $l$-th columns of $A$ are both positive.

Proof. Let $A = [a_{jk}]$ and let $A_1, \ldots, A_n$ be the columns of $A$. To prove that
(i) and (ii) are equivalent we consider their negations. By Proposition 5.1 and
the definition, $\operatorname{var} A = a$ if and only if there exist $k_0, l_0 \in \{1, \ldots, n\}$ such that

\[ \|A_{k_0} - A_{l_0}\| = 2a. \]

This is equivalent to

\[ \sum_{j=1}^{m} |a_{jk_0} - a_{jl_0}| = \sum_{j=1}^{m} (a_{jk_0} + a_{jl_0}). \tag{5.1} \]

Since all the terms in the last equality are non-negative, (5.1) is equivalent to
$a_{jk_0}$ or $a_{jl_0}$ being $0$ for all $j \in \{1, \ldots, m\}$.

Hence, $\operatorname{var} A = a$ if and only if there exist $k_0, l_0 \in \{1, \ldots, n\}$ such that
for all $j \in \{1, \ldots, m\}$ we have $a_{jk_0} a_{jl_0} = 0$. This proves that (i) and (ii) are
equivalent. □

Now we can give a short proof of Theorem 1.1.

Proof. Let $M$ be a regular Markov matrix and assume that $M^p$ is positive. By
Proposition 5.2, $\operatorname{var}(M^p) < 1$. Therefore Theorem 4.2 applies. □
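Condition (ii) of Proposition 5.2 is a purely combinatorial test on the zero pattern of the columns, which makes it convenient to code. The following sketch compares it with the inequality $\operatorname{var} A < 1$ on randomly generated non-negative matrices of type 1. The generator `random_markov`, the seed, and the sample sizes are assumptions of ours, not part of the original text.

```python
import random
from fractions import Fraction

def l1_norm(X):
    return sum(abs(x) for x in X)

def variation(A):
    """var A: half the largest l1-distance between two columns of A."""
    cols = list(zip(*A))
    return Fraction(1, 2) * max(
        l1_norm([u - v for u, v in zip(cj, ck)]) for cj in cols for ck in cols)

def columns_overlap(A):
    """Condition (ii): every two columns share a row where both are positive."""
    cols = list(zip(*A))
    return all(any(u > 0 and v > 0 for u, v in zip(ck, cl))
               for ck in cols for cl in cols)

def random_markov(m, n):
    """Random m x n non-negative matrix of type 1 (hypothetical helper)."""
    cols = []
    for _ in range(n):
        col = [random.choice([0, 0, 1, 2, 3]) for _ in range(m)]
        if sum(col) == 0:
            col[random.randrange(m)] = 1      # avoid a zero column
        total = sum(col)
        cols.append([Fraction(x, total) for x in col])
    return [list(row) for row in zip(*cols)]  # transpose columns into rows

random.seed(1)
agree = all(
    (variation(A) < 1) == columns_overlap(A)
    for A in (random_markov(random.randint(1, 4), random.randint(1, 4))
              for _ in range(500)))
print("criteria agree on all samples:", agree)
```

The choice list `[0, 0, 1, 2, 3]` makes zero entries frequent, so that both outcomes of the test occur among the samples.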



6 Examples. 



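Example 6.1. In the Introduction we used

\[ M = \frac{1}{5}\begin{bmatrix} 0 & 2 & -4 \\ -1 & -1 & 0 \\ 6 & 4 & 9 \end{bmatrix} \]

as an example of a matrix for which the powers converge. The largest $\ell_1$-distance
between two columns is between the second and the third, and is equal to $12/5$.
Therefore, $\operatorname{var} M = 6/5 > 1$. But

\[ M^2 = \frac{1}{25}\begin{bmatrix} -26 & -18 & -36 \\ 1 & -1 & 4 \\ 50 & 44 & 57 \end{bmatrix} \]

and $\operatorname{var}(M^2) = 18/25 < 1$. Hence, Theorem 4.2 applies.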
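The numbers in this example can be reproduced exactly. The sketch below assumes the reading $M = \frac{1}{5}[[0, 2, -4], [-1, -1, 0], [6, 4, 9]]$ of the matrix; `column_distances` is a hypothetical helper of ours that lists the $\ell_1$-distance for every pair of distinct columns.

```python
from fractions import Fraction
from itertools import combinations

def mat_mul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def column_distances(A):
    """l1-distance between each pair of distinct columns, keyed by 1-based indices."""
    cols = list(zip(*A))
    return {(j + 1, k + 1): sum(abs(u - v) for u, v in zip(cols[j], cols[k]))
            for j, k in combinations(range(len(cols)), 2)}

# Assumed reading of the example matrix.
M = [[Fraction(x, 5) for x in row]
     for row in ([0, 2, -4], [-1, -1, 0], [6, 4, 9])]
M2 = mat_mul(M, M)

dist_M = column_distances(M)
dist_M2 = column_distances(M2)
var_M = max(dist_M.values()) / 2      # half the largest column distance
var_M2 = max(dist_M2.values()) / 2

print("column distances of M:  ", dist_M)
print("column distances of M^2:", dist_M2)
print("var M =", var_M, "  var(M^2) =", var_M2)
```

In both $M$ and $M^2$ the maximal distance is attained by the pair of columns (2, 3).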

Example 6.2. For $2 \times 2$ matrices of type 1 it is possible to give a complete
analysis. Let $a, b \in \mathbb{R}$ and set

\[ M = \begin{bmatrix} 1 - a & b \\ a & 1 - b \end{bmatrix} \quad\text{and}\quad c = a + b. \]

Then $\operatorname{var} M = |1 - c|$. We distinguish the following three cases:

(i) $c \ne 0$. The eigenvalues of $M$ are $1$ and $1 - c$, and the corresponding
eigenvectors are

\[ \begin{bmatrix} b \\ a \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} -1 \\ 1 \end{bmatrix}. \]

If $0 < c < 2$, then $\operatorname{var} M < 1$. Consequently,

\[ E = \frac{1}{c}\begin{bmatrix} b \\ a \end{bmatrix} \]

and $M^k$ converges. Otherwise, $M^k$ diverges.

(ii) $c = 0$, $a \ne 0$. In this case, $\operatorname{var} M = 1$ and $1$ is an eigenvalue of multiplicity
2. It can be shown by induction that

\[ M^k = I + k a \begin{bmatrix} -1 & -1 \\ 1 & 1 \end{bmatrix}. \]

So $M^k$ diverges.

(iii) $c = a = b = 0$. So, $M = I$.

Thus, for a $2 \times 2$ matrix $M$ of type 1 which is not the identity matrix, $M^k$
converges if and only if $\operatorname{var} M < 1$. Regular $2 \times 2$ Markov matrices were studied
in gj.
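The formula $\operatorname{var} M = |1 - c|$ with $c = a + b$ and the closed form for $M^k$ in case (ii) can be checked exactly. In the sketch below the sample parameters and the helper name `type_one_2x2` are arbitrary choices of ours.

```python
from fractions import Fraction

def l1_norm(X):
    return sum(abs(x) for x in X)

def variation(A):
    """var A: half the largest l1-distance between two columns of A."""
    cols = list(zip(*A))
    return Fraction(1, 2) * max(
        l1_norm([u - v for u, v in zip(cj, ck)]) for cj in cols for ck in cols)

def mat_mul(A, B):
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def type_one_2x2(a, b):
    """The general 2 x 2 matrix of type 1: columns (1 - a, a) and (b, 1 - b)."""
    return [[1 - a, b], [a, 1 - b]]

# var M = |1 - c| with c = a + b, on a few sample parameters.
samples = [(Fraction(1, 3), Fraction(1, 2)), (Fraction(-1, 4), Fraction(3, 2)),
           (Fraction(2), Fraction(1)), (Fraction(0), Fraction(0))]
var_formula_ok = all(variation(type_one_2x2(a, b)) == abs(1 - (a + b))
                     for a, b in samples)

# Case (ii): c = 0 and a != 0, where M^k = I + k a N with N = [[-1, -1], [1, 1]].
a = Fraction(3, 7)
M = type_one_2x2(a, -a)
power = M
case_ii_ok = True
for k in range(1, 7):
    expected = [[1 - k * a, -k * a], [k * a, 1 + k * a]]
    case_ii_ok = case_ii_ok and power == expected
    power = mat_mul(power, M)

print("var M = |1 - c| on samples:", var_formula_ok)
print("M^k = I + k a N for k = 1..6:", case_ii_ok)
```

The closed form in case (ii) works because $N^2 = 0$, so the binomial expansion of $(I + aN)^k$ stops after the linear term.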

Example 6.3. Consider the following three kinds of Markov matrices: 





"1 + +" 




"+ + 0" 




"0 + 0" 


K = 


+ + 


, L = 


+ + 


, M = 


1 




+ 




+ + 




1 + 



Here we use $+$ for positive numbers. All claims below about the variation rely
on Propositions 5.1 and 5.2.

The matrix $K$ is not regular, but $\operatorname{var} K < 1$. Also, $E = \begin{bmatrix} 1 & 0 & 0 \end{bmatrix}^T$.

The matrix $L$ is not positive, but $\operatorname{var} L < 1$. Also, Theorem 1.1 applies since
$L^2$ is positive.

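The first five powers of $M$ are:

\[
\begin{bmatrix} 0 & + & 0 \\ 0 & 0 & 1 \\ 1 & + & 0 \end{bmatrix}, \quad
\begin{bmatrix} 0 & 0 & + \\ 1 & + & 0 \\ 0 & + & + \end{bmatrix}, \quad
\begin{bmatrix} + & + & 0 \\ 0 & + & + \\ + & + & + \end{bmatrix}, \quad
\begin{bmatrix} 0 & + & + \\ + & + & + \\ + & + & + \end{bmatrix}, \quad
\begin{bmatrix} + & + & + \\ + & + & + \\ + & + & + \end{bmatrix}.
\]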

The variation of the first two matrices is 1, while $\operatorname{var}(M^3) < 1$. The first positive
power of $M$ is $M^5$.

In fact, the following general statement holds. For a $3 \times 3$ Markov matrix
$M$, the sequence $M^k$, $k \in \mathbb{N}$, converges to a projection of rank 1 if and only if
$\operatorname{var}(M^3) < 1$. This was verified by examining all possible cases; see [1].

Example 6.4. In this example we consider matrices with complex entries. Let
$a = (-1 + i\sqrt{3})/2$. Then $1$, $a$, and $\bar{a}$ are the cube roots of unity. Notice that
$1 + a + \bar{a} = 0$, $a^2 = \bar{a}$, and $\bar{a}^2 = a$.

The examples below were suggested by the following orthogonal basis for the
complex inner product space $\mathbb{C}^3$:

\[ U = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \quad
   V = \begin{bmatrix} 1 \\ a \\ \bar{a} \end{bmatrix}, \quad
   W = \begin{bmatrix} 1 \\ \bar{a} \\ a \end{bmatrix}. \]

We first give an example which shows that the conclusion of Theorem 3.1
does not hold for matrices with complex entries. Set $A = \begin{bmatrix} 1 & \bar{a} & a \end{bmatrix}$. Then
$AV = [3]$, $\operatorname{sum} V = 0$, $\operatorname{var} A = \sqrt{3}/2 < 1$, and

\[ \|AV\| > (\operatorname{var} A)\,\|V\|. \]

Next we give an example showing that the restriction to real numbers cannot
be dropped in Theorem 4.2. Consider the matrices

\[ P = \frac{1}{3}\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}
   \quad\text{and}\quad
   Q = \frac{1}{3}\begin{bmatrix} 1 & \bar{a} & a \\ a & 1 & \bar{a} \\ \bar{a} & a & 1 \end{bmatrix}. \]

Notice that $P$ is the orthogonal projection onto the span of $U$ and $Q$ is the
orthogonal projection onto the span of $V$. Let $c \in \mathbb{R}$ and set

\[ M = P + cQ. \]

Then $PV = 0$ and $QV = V$. Therefore $MV = cV$, showing that $c$ is an
eigenvalue of $M$.

The matrix $P$ is of type 1 with variation 0, while $Q$ is of type 0 with variation
$\sqrt{3}/2$. Hence, $M$ is of type 1 and

\[ \operatorname{var} M = \operatorname{var}(cQ) = |c|\,\sqrt{3}/2. \]

Therefore, if $1 < c < 2/\sqrt{3}$, then $\operatorname{var} M < 1$, but $M^k$ diverges.
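Both counterexamples can be checked in floating-point arithmetic with Python's built-in complex numbers. The sketch below is ours and rests on assumptions: it takes $V = [1, a, \bar{a}]^T$ and $A = [1, \bar{a}, a]$, picks the sample value $c = 1.1$ from the interval $(1, 2/\sqrt{3})$, and follows $M^k V$ for 20 steps; equalities hold only up to rounding.

```python
import math

def l1_norm(X):
    return sum(abs(x) for x in X)        # abs() is the modulus for complex entries

def variation(A):
    """var A: half the largest l1-distance between two columns of A."""
    cols = list(zip(*A))
    return 0.5 * max(l1_norm([u - v for u, v in zip(cj, ck)])
                     for cj in cols for ck in cols)

def mat_vec(A, X):
    return [sum(a * x for a, x in zip(row, X)) for row in A]

a = complex(-0.5, math.sqrt(3) / 2)      # a primitive cube root of unity
abar = a.conjugate()
V = [1, a, abar]                         # sum V = 0 up to rounding

# Theorem 3.1 fails over C: ||AV|| = 3 exceeds (var A)||V|| = 3*sqrt(3)/2.
A = [[1, abar, a]]
norm_AV = l1_norm(mat_vec(A, V))
bound = variation(A) * l1_norm(V)

# Theorem 4.2 fails over C: M = P + cQ with the sample value c = 1.1.
c = 1.1
Q = [[1 / 3, abar / 3, a / 3],
     [a / 3, 1 / 3, abar / 3],
     [abar / 3, a / 3, 1 / 3]]
M = [[1 / 3 + c * q for q in row] for row in Q]   # every entry of P is 1/3
var_M = variation(M)

X = V
for _ in range(20):
    X = mat_vec(M, X)                    # after the loop X is M^20 V = c^20 V
growth = l1_norm(X) / l1_norm(V)

print(f"||AV|| = {norm_AV:.4f}   (var A)||V|| = {bound:.4f}")
print(f"var M = {var_M:.4f}   ||M^20 V|| / ||V|| = {growth:.4f}")
```

The growth factor matches $c^{20}$, so $M^k V$ is unbounded even though $\operatorname{var} M < 1$.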



7 The variation as a norm. 



The first proposition below shows that the variation function is a pseudo-norm
on the space of all $m \times n$ matrices. The remaining propositions identify the
variation of a matrix as the norm of a related linear transformation.

Proposition 7.1. Let $A$ be an $m \times n$ matrix.

(a) If $c \in \mathbb{R}$, then $\operatorname{var}(cA) = |c| \operatorname{var} A$.

(b) All columns of $A$ are identical if and only if $\operatorname{var} A = 0$.

(c) If $B$ is another $m \times n$ matrix, then $\operatorname{var}(A + B) \le \operatorname{var} A + \operatorname{var} B$.

Proof. The proofs of (a) and (b) are straightforward. To prove (c), let $A_1, \ldots, A_n$
be the columns of $A$ and let $B_1, \ldots, B_n$ be the columns of $B$. Then
$A_1 + B_1, \ldots, A_n + B_n$ are the columns of $A + B$, and for all $j, k \in \{1, \ldots, n\}$,

\[ \|(A_j + B_j) - (A_k + B_k)\| \le \|A_j - A_k\| + \|B_j - B_k\|. \qquad □ \]

Proposition 7.2. Let $A$ be an $m \times n$ matrix with more than one column. Then

\[ \operatorname{var} A = \max\bigl\{ \|AX\| : X \in \mathbb{R}^n,\ \|X\| = 1,\ \operatorname{sum} X = 0 \bigr\}. \tag{7.1} \]

Proof. It follows from Theorem 3.1 that the set on the right-hand side of (7.1)
is bounded by $\operatorname{var} A$. To prove that $\operatorname{var} A$ is the maximum, let $A_{j_0}$ and $A_{k_0}$
be columns of $A$ such that $\|A_{j_0} - A_{k_0}\| = 2 \operatorname{var} A$. Choose $X_0 \in \mathbb{R}^n$ such that
its $j_0$-th entry is $1/2$, its $k_0$-th entry is $-1/2$ and all other entries are $0$. Then
$\|AX_0\| = \operatorname{var} A$, $\|X_0\| = 1$ and $\operatorname{sum}(X_0) = 0$. □

Remark 7.3. Let $V_n$ denote the set of all vectors $X \in \mathbb{R}^n$ such that $\operatorname{sum} X = 0$.
This is a subspace of $\mathbb{R}^n$. An $m \times n$ matrix $A$ determines a linear
transformation from $V_n$ to $\mathbb{R}^m$. The previous proposition tells us that the norm of this
transformation is $\operatorname{var} A$.

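Proposition 7.4. Let $B$ be an $m \times n$ matrix of type $b$ with more than one row.
Then

\[ \operatorname{var} B = \max\bigl\{ \operatorname{var}(ZB) : Z^T \in \mathbb{R}^m,\ \operatorname{var} Z = 1 \bigr\}. \tag{7.2} \]

Proof. If $\operatorname{var} B = 0$ the statement follows from Proposition 3.2. So, assume
$\operatorname{var} B > 0$. By Proposition 3.2 the set on the right-hand side of (7.2) is bounded
by $\operatorname{var} B$. Let $B = [b_{jk}]$ and let $B_{k_0}$ and $B_{l_0}$ be columns of $B$ such that
$\|B_{k_0} - B_{l_0}\| = 2 \operatorname{var} B$. Let $Z_0$ be the row with entries defined by:

\[ z_j := \begin{cases} 1 & \text{if } b_{jk_0} > b_{jl_0}, \\ -1 & \text{if } b_{jk_0} \le b_{jl_0}. \end{cases} \]

Since $\|B_{k_0} - B_{l_0}\| > 0$ and $\operatorname{sum}(B_{k_0}) = \operatorname{sum}(B_{l_0}) = b$, there exist at least
one positive and at least one negative entry in $Z_0$. Therefore, $\operatorname{var}(Z_0) = 1$.
By the definition of $Z_0$ the difference between the $k_0$-th and the $l_0$-th entry in $Z_0 B$
is $\|B_{k_0} - B_{l_0}\| = 2 \operatorname{var} B$. Notice that if $Z$ is a row matrix, then
$\operatorname{var} Z = \frac{1}{2}(\max Z - \min Z)$. Therefore, $\operatorname{var}(Z_0 B) \ge \operatorname{var} B$. This proves (7.2). □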

Remark 7.5. Let $\mathbb{R}_m$ denote the set of all row vectors with $m$ entries. Denote
by $J_m$ the subspace of all scalar multiples of $J$. An $m \times n$ matrix $B$ of type $b$
determines a linear transformation from the factor space $\mathbb{R}_m / J_m$ to the factor
space $\mathbb{R}_n / J_n$ in the following way:

\[ Z + J_m \mapsto ZB + J_n, \qquad Z \in \mathbb{R}_m. \]

It is easy to verify that this is a well-defined linear transformation. By
Proposition 7.4 the norm of this transformation is exactly $\operatorname{var} B$.